minicbor-derive 0.4.1

Procedural macros to derive minicbor's `Encode` and `Decode` traits.

Deriving is supported for `struct`s and `enum`s. The encoding is optimised for forward and backward compatibility and the overall approach is influenced by [Google's Protocol Buffers][1].

The goal is that, ideally, a change to a type still allows older software, which is unaware of the change, to decode values of the changed type (forward compatibility), and allows newer software to decode values encoded by older software, which do not include the changes made to the type (backward compatibility).

To reach this goal, the encoding has the following characteristics:

1. The encoding does not contain any names, i.e. no field names, type names or variant names. Instead, every field and every constructor needs to be annotated with an (unsigned) index number, e.g. `#[n(1)]`.

2. Unknown fields are ignored during decoding.

3. Optional types default to `None` if their value is not present during decoding.

4. Optional enums default to `None` if an unknown variant is encountered during decoding.

Item **1** ensures that names can be changed freely without compatibility concerns. Item **2** ensures that new fields do not affect older software. Item **3** ensures that newer software can stop producing optional values. Item **4** ensures that enums can get new variants that older software is not aware of. By "fields" we mean the elements of structs and tuple structs as well as enum structs and enum tuples. In addition, it is a compatible change to turn a unit variant into a struct or tuple variant if all fields are optional.

From the above it should be obvious that *non-optional fields need to be present forever*, so they should only be part of a type after careful consideration.

It should be emphasised that an `enum` itself cannot be changed in a compatible way. An unknown variant causes an error.
It is only when they are declared as an optional field type that unknown variants of an enum are mapped to `None`. In other words, *only structs can be used as top-level types in a forward and backward compatible way, enums cannot.*

# Example

```
use minicbor::{Encode, Decode};

#[derive(Encode, Decode)]
struct Point {
    #[n(0)] x: f64,
    #[n(1)] y: f64
}

#[derive(Encode, Decode)]
struct ConvexHull {
    #[n(0)] left: Point,
    #[n(1)] right: Point,
    #[n(2)] points: Vec<Point>,
    #[n(3)] state: Option<State>
}

#[derive(Encode, Decode)]
enum State {
    #[n(0)] Start,
    #[n(1)] Search { #[n(0)] info: u64 }
}
```

In this example the following changes would be compatible in both directions:

- Renaming every identifier.

- Adding optional fields to `Point`, `ConvexHull`, `State::Start` or `State::Search`.

- Adding more variants to `State` *iff* `State` is only decoded as part of `ConvexHull`. Direct decoding of `State` would produce an `UnknownVariant` error for those new variants.

[1]: https://developers.google.com/protocol-buffers/

# Attributes and borrowing

Each field and variant needs to be annotated with an index number, which is used instead of the name, using either **`n`** or **`b`** as the attribute name. For encoding it makes no difference which one is chosen. For decoding, `b` indicates that the value borrows from the decoding input, whereas `n` produces non-borrowed values (except for implicit borrows).

## Encoding format

The actual CBOR encoding to use can be selected by attaching either the **`#[cbor(array)]`** or **`#[cbor(map)]`** attribute to structs, enums or enum variants. By default `#[cbor(array)]` is implied. The attribute attached to an enum applies to all its variants but can be overridden per variant with another such attribute.
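
The precedence rule just described (a variant-level attribute wins over an enum-level one, and `#[cbor(array)]` applies when neither is given) can be modelled in a few lines of plain Rust. This is only an illustrative sketch of the rule as stated here, not the code the derive macro generates; `Encoding` and `variant_encoding` are hypothetical names introduced for the example.

```rust
/// The two container formats that can be selected with `#[cbor(...)]`.
/// (Hypothetical type, used only to model the rule.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Encoding {
    Array,
    Map,
}

/// Resolve the encoding of one enum variant.
///
/// `on_enum` is the attribute attached to the enum itself (if any),
/// `on_variant` the attribute attached to the variant (if any).
fn variant_encoding(on_enum: Option<Encoding>, on_variant: Option<Encoding>) -> Encoding {
    on_variant              // a per-variant attribute takes precedence,
        .or(on_enum)        // otherwise the enum-level attribute applies,
        .unwrap_or(Encoding::Array) // and array encoding is the default.
}
```

For instance, an enum marked `#[cbor(map)]` whose variant carries `#[cbor(array)]` resolves to `Encoding::Array` for that variant and to `Encoding::Map` for all the others.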
## Implicit borrowing

The following types implicitly borrow from the decoding input, which means their lifetimes are constrained by the input lifetime:

- `&'_ str`
- `&'_ [u8]`
- `Option<&'_ str>`
- `Option<&'_ [u8]>`

## Explicit borrowing

If a type is annotated with **`#[b(...)]`**, all its lifetimes will be constrained to the input lifetime.

If the type is a `std::borrow::Cow<'_, str>` or `std::borrow::Cow<'_, [u8]>` type, the generated code will decode the inner type and construct a `Cow::Borrowed` variant, contrary to the `Cow` impl of `Decode` which produces owned values.

## Other attributes

### `encode_with`, `decode_with` and `with`

Fields in structs and enum variants may be annotated with **`#[cbor(encode_with = "path")]`**, **`#[cbor(decode_with = "path")]`** or **`#[cbor(with = "module-path")]`** where `path` is the full path to a function which is used instead of `Encode::encode` to encode the field or `Decode::decode` to decode the field respectively. The types of these functions must be equivalent to `Encode::encode` or `Decode::decode`. The `with` attribute combines the other two with `module-path` denoting the full path to a module with two functions `encode` and `decode` as members, which are used for encoding and decoding of the field. These three attributes can either override an existing `Encode` or `Decode` impl or be used for types which do not implement those traits at all.

### `transparent`

A **`#[cbor(transparent)]`** attribute can be attached to structs with exactly one field (aka newtypes). If present, the generated `Encode` and `Decode` impls will just forward the `encode`/`decode` calls to the inner type, i.e. the resulting CBOR representation will be identical to the one of the inner type.

# CBOR encoding

The CBOR values produced by a derived `Encode` implementation are of the following formats.

## Structs

### Array encoding

By default or if a struct has the **`#[cbor(array)]`** attribute, it will be represented as a CBOR array.
Its index numbers are represented by the position of the field value in this array. Any gaps between index numbers are filled with CBOR NULL values and `Option`s which are `None` likewise end up as NULLs in this array.

```text
<<struct-as-array encoding>> =
    `array(n)`
        item_0
        item_1
        ...
        item_n
```

### Map encoding

If a struct has the **`#[cbor(map)]`** attribute, then it will be represented as a CBOR map with keys corresponding to the numeric index value:

```text
<<struct-as-map encoding>> =
    `map(n)`
        `0` item_0
        `1` item_1
        ...
         n  item_n
```

Optional fields whose value is `None` are not encoded.

## Enums

Each enum variant is encoded as a two-element array. The first element is the variant index and the second the actual variant value:

```text
<<enum encoding>> =
    | `array(2)` n <<struct-as-array encoding>> ; if #[cbor(array)]
    | `array(2)` n <<struct-as-map encoding>>   ; if #[cbor(map)]
```

## Which encoding to use?

The map encoding needs to represent the indexes explicitly in the encoding, which costs at least one extra byte per field value, whereas the array encoding does not need to encode the indexes. On the other hand, absent values, i.e. `None`s and gaps between indexes, are not encoded with maps but need to be encoded explicitly with arrays as NULLs, which need one byte each. Which encoding to choose therefore depends on the nature of the type that should be encoded:

- *Dense types* are types which contain only a few `Option`s or whose `Option`s are assumed to be `Some` most of the time. They are best encoded as arrays.

- *Sparse types* are types with many `Option`s whose values are usually `None`. They are best encoded as maps.

When selecting the encoding, future changes to the type should be considered as they may turn a dense type into a sparse one over time.
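
To make the size trade-off concrete, the following sketch hand-encodes a list of optional small integers in both layouts using only the standard library. It is an illustration under stated assumptions, not what the derive macro emits: `encode_as_array` and `encode_as_map` are hypothetical helpers, only values and lengths below 24 are handled (so that every CBOR head fits in one byte), and the slice length is simply taken as the array length.

```rust
/// CBOR simple value `null`.
const CBOR_NULL: u8 = 0xf6;

/// Array layout: the position in the array is the index; absent values become NULL.
/// `slots[i]` holds the value of the field with index `i`, or `None` for a gap or a `None` option.
fn encode_as_array(slots: &[Option<u8>]) -> Vec<u8> {
    assert!(slots.len() < 24, "sketch only handles short arrays");
    // Major type 4 (array) with the length stored in the head byte.
    let mut out = vec![0x80 | slots.len() as u8];
    for slot in slots {
        match slot {
            Some(v) => {
                assert!(*v < 24, "sketch only handles small unsigned integers");
                out.push(*v); // major type 0 (unsigned), value stored in the head byte
            }
            None => out.push(CBOR_NULL), // one byte per absent value
        }
    }
    out
}

/// Map layout: each present value is preceded by its index; absent values are skipped.
fn encode_as_map(slots: &[Option<u8>]) -> Vec<u8> {
    assert!(slots.len() < 24, "sketch only handles small indexes");
    let present = slots.iter().filter(|s| s.is_some()).count();
    // Major type 5 (map) with the number of key/value pairs in the head byte.
    let mut out = vec![0xa0 | present as u8];
    for (index, slot) in slots.iter().enumerate() {
        if let Some(v) = slot {
            assert!(*v < 24, "sketch only handles small unsigned integers");
            out.push(index as u8); // key: one extra byte per present value
            out.push(*v);          // value
        }
    }
    out
}
```

With three dense slots such as `[Some(1), Some(2), Some(3)]`, the array form takes 4 bytes and the map form 7. With ten slots of which only the last is present, the array form takes 11 bytes and the map form 3, which matches the dense-versus-sparse guidance above.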